AITopics | model family

adc98a266f45005c403b8311ca7e8bd7-Paper-Conference.pdf

Neural Information Processing SystemsApr-29-2026, 10:08:29 GMT

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

CogFormer: Learn All Your Models Once

Huang, Jerry M., Schumacher, Lukas, Stevenson, Niek, Radev, Stefan T.

arXiv.org Machine LearningMar-24-2026

Simulation-based inference (SBI) with neural networks has accelerated and transformed cognitive modeling workflows. SBI enables modelers to fit complex models that were previously difficult or impossible to estimate, while also allowing rapid estimation across large numbers of datasets. However, the utility of SBI for iterating over varying modeling assumptions remains limited: changing parameterizations, generative functions, priors, and design variables all necessitate model retraining and hence diminish the benefits of amortization. To address these issues, we pilot a meta-amortized framework for cognitive modeling which we nickname the CogFormer. Our framework trains a transformer-based architecture that remains valid across a combinatorial number of structurally similar models, allowing for changing data types, parameters, design matrices, and sample sizes. We present promising quantitative results across families of decision-making models for binary, multi-alternative, and continuous responses. Our evaluation suggests that CogFormer can accurately estimate parameters across model families with a minimal amortization offset, making it a potentially powerful engine that catalyzes cognitive modeling workflows.

artificial intelligence, machine learning, simulation of human behavior, (16 more...)

arXiv.org Machine Learning

2603.2052

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

\textit{Trans-LoRA} : towards data-free Transferable Parameter Efficient Finetuning

Neural Information Processing SystemsMar-21-2026, 02:37:56 GMT

Low-rank adapters (LoRA) and their variants are popular parameter-efficient fine-tuning (PEFT) techniques that closely match full model fine-tune performance while requiring only a small number of additional parameters. These additional LoRA parameters are specific to the base model being adapted. When the base model needs to be deprecated and replaced with a new one, all the associated LoRA modules need to be re-trained. Such re-training requires access to the data used to train the LoRA for the original base model. This is especially problematic for commercial cloud applications where the LoRA modules and the base models are hosted by service providers who may not be allowed to host proprietary client task data. To address this challenge, we propose $\textit{Trans-LoRA}$ --- a novel method for lossless, nearly data-free transfer of LoRAs across base models. Our approach relies on synthetic data to transfer LoRA modules. Using large language models, we design a synthetic data generator to approximate the data-generating process of the $\textit{observed}$ task data subset.

artificial intelligence, cloud computing, natural language, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Cloud Computing (0.59)
Information Technology > Artificial Intelligence > Natural Language (0.59)

Add feedback

adc98a266f45005c403b8311ca7e8bd7-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 12:59:50 GMT

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

83e77607638c4fb17fba4a9b7844800c-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 06:38:07 GMT

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)
(3 more...)

Add feedback

RevisitingtheCalibrationof ModernNeuralNetworks

Neural Information Processing SystemsFeb-9-2026, 15:16:57 GMT

These concerns are more relevant than ever, since the architecture size, amount of training data, and computing power used by state-of-the-art models continue to increase.

artificial intelligence, calibration, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

1cded4f97cf5f01a284c574110b7e3b9-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 23:29:42 GMT

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology (0.45)
Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)

Add feedback

5c528e25e1fdeaf9d8160dc24dbf4d60-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-8-2026, 13:35:20 GMT

algorithm regularizer, reviewer, training mlp, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

Add feedback

A Data-driven Typology of Vision Models from Integrated Representational Metrics

Wu, Jialin, Saha, Shreya, Bo, Yiqing, Khosla, Meenakshi

arXiv.org Artificial IntelligenceDec-10-2025

Large vision models differ widely in architecture and training paradigm, yet we lack principled methods to determine which aspects of their representations are shared across families and which reflect distinctive computational strategies. We leverage a suite of representational similarity metrics, each capturing a different facet-geometry, unit tuning, or linear decodability-and assess family separability using multiple complementary measures. Metrics preserving geometry or tuning (e.g., RSA, Soft Matching) yield strong family discrimination, whereas flexible mappings such as Linear Predictivity show weaker separation. These findings indicate that geometry and tuning carry family-specific signatures, while linearly decodable information is more broadly shared. To integrate these complementary facets, we adapt Similarity Network Fusion (SNF), a method inspired by multi-omics integration. SNF achieves substantially sharper family separation than any individual metric and produces robust composite signatures. Clustering of the fused similarity matrix recovers both expected and surprising patterns: supervised ResNets and ViTs form distinct clusters, yet all self-supervised models group together across architectural boundaries. Hybrid architectures (ConvNeXt, Swin) cluster with masked autoencoders, suggesting convergence between architectural modernization and reconstruction-based training. This biology-inspired framework provides a principled typology of vision models, showing that emergent computational strategies-shaped jointly by architecture and training objective-define representational structure beyond surface design categories.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Artificial Intelligence

2509.21628

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

Measuring the Measures: Discriminative Capacity of Representational Similarity Metrics Across Model Families

Wu, Jialin, Saha, Shreya, Bo, Yiqing, Khosla, Meenakshi

arXiv.org Artificial IntelligenceDec-10-2025

Representational similarity metrics are fundamental tools in neuroscience and AI, yet we lack systematic comparisons of their discriminative power across model families. We introduce a quantitative framework to evaluate representational similarity measures based on their ability to separate model families-across architectures (CNNs, Vision Transformers, Swin Transformers, ConvNeXt) and training regimes (supervised vs. self-supervised). Using three complementary separability measures-dprime from signal detection theory, silhouette coefficients and ROC-AUC, we systematically assess the discriminative capacity of commonly used metrics including RSA, linear predictivity, Procrustes, and soft matching. We show that separability systematically increases as metrics impose more stringent alignment constraints. Among mapping-based approaches, soft-matching achieves the highest separability, followed by Procrustes alignment and linear predictivity. Non-fitting methods such as RSA also yield strong separability across families. These results provide the first systematic comparison of similarity metrics through a separability lens, clarifying their relative sensitivity and guiding metric choice for large-scale model and brain comparisons.

artificial intelligence, machine learning, representation, (14 more...)

arXiv.org Artificial Intelligence

2509.04622

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.50)

Technology: